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FIRST  ORDER  AUTOREGRESSIVE  GAMMA  SEQUENCES 
AND  POINT  PROCESSES 


* 

D.  P.  Gaver 


and 
P.  A.  W.  Lewis 


Naval  Postgraduate  School 
Monterey,  CA   93940 


1.   INTRODUCTION 

The  Poisson  process  is  a  basic  model  for  point  processes 

(series  of  events)  and  can  be  characterized  as  a  process  in  which 
the  intervals  between  events  are  independent  and  exponentially 
distributed.    In  one  of  the  earliest  papers  on  point  processes 

(Wold,  1948)  an  attempt  was  made  to  generalize  the  Poisson  process 
by  obtaining  dependent  but  marginally  exponentially  distributed 
intervals  between  events.   Similarly  Cox  (1955)  attempted  to 
obtain  a  sequence  of  random  variables  with  conditionally  exponential 
distributions.   Neither  of  these  attempts  to  generalize  the  Poisson 
process  led  to  analytically  tractable  results.   In  principle, 
of  course,  it  is  simple  to  generate  a  sequence  of  marginally 
exponentially  distributed  random  variables  with  Markov  dependence 
if  a  bivariate  exponential  random  variable  is  available.   However, 
despite  the  recent  discovery  of  many  bivariate  exponential  random 
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variable  generation  schemes  [Marshall  and  Olkin  (1967),  Hawkes  (1972); 
Gaver  (1972);  Downton  (1970)],  none  seem  to  have  simple  enough 
properties,  conditional  and  otherwise,  to  lead  to  analytically 
and  computationally  tractable  models  for  a  non-independent 
(Markovian)  and  easily  simulated  sequence  of  marginally  exponentially 
distributed  random  variables.   Interest  in  the  aforementioned 
processes  also  may  have  been  damped  by  the  development  of  physically 
motivated  point  process  models  such  as  the  Poisson  cluster  process 
and  the  self-exciting  process,  for  example  see  Neyman  and  Scott 
(1972).   Unfortunately,  many  of  these  point  process  models  are 
somewhat  awkward  to  handle  analytically  or  computationally. 

In  this  paper  we  show  that  if  one  starts  with  the  usual 
linear,  additive  autoregressive  equation  (first-order  stochastic 
difference  equation) 

X   =  pX   .  +  e   ,         n  =  0,  +  1,  +  2,  . . .    (1.1) 
n    K  n-1     n  —    — 

where  the  innovation  sequence   {e  }   is  one  of  i.i.d.  random  vari- 

J         n 

ables,  then  there  is,  for   0  <_   p  <  1,  a  distribution  for  the   e  's 
such  that  the   X  's  have,  marginally,  an  exponential  distribution. 
The  resulting  exponential  autoregressive  process  (EAR1)  has  several 
attractive  features.   First,  the  sequence  is  obtained  as  an  additive 
random  linear  combination  of  random  variables,  and  is  easy  to 
simulate,  i.e.  to  realize  on  a  computer.   In  simulating  a  queue, 
for  instance,  one  can  easily  obtain  sequences  of  correlated  service 
times  or  interarrival  times,  with  respectively  an  i.i.d.  exponential 
sequence  or  Poisson  process  as  a  special  case   (p  =  0).   These 


correlated  sequences  are  useful  for  checking  for  the  sensitivity 
of  standard  queueing  results  to  departures  from  independence,  and 
for  other  robustness  studies. 

A  second  feature  is  that  the  EAR1  process  (1.1)  is  analyti- 
cally tractable;  one  can,  for  instance,  obtain  the  Laplace  transform 
of  the  distribution  of  sums  of  the  random  variables.   This  is 
essential  if  the  process  is  to  be  used  as  a  model  for  point  pro- 
cesses and  one  is  to  obtain  the  point  spectrum. 

A  third  feature  of  the  EAR1  process  is  that  its  structure 
leads  to  an  extended  model  for  a  sequence  of  marginally  exponentially 
distributed  random  variables  with  the  correlation  structure  of  a 
mixed  autoregressive  moving  average  process  (ARMA  p,q),  the  orders 
of  the  autoregression  and  the  moving  average  being   p   and   q 
respectively.   Various  properties  of  this  exponential  process, 
called  the  EARMA(p,q)  process,  are  detailed  in  Lawrance  and  Lewis 
(1977),  Jacobs  and  Lewis  (1977)  and  Lawrance  and  Lewis  (1978). 

In  Section  2  of  this  paper  we  discuss  the  exponential 
solution  of  equation  (1.1),  as  well  as  questions  of  stationarity 
and  mixing.   The  distribution  of  sums  of  random  variables  from 
this  process  is  discussed  next;  this  relates  in  particular  to  the 
use  of  the  process  to  model  intervals  between  events  in  point 
processes.   Joint  distributions  and  conditional  correlations  of 
two  and  three  random  variables  are  derived  in  Section  4 .   The 
estimation  of  the  exponential  parameter   A,  the  reciprocal  of 
E(X  ),  and  the  correlation  parameter   p   are  briefly  considered 
in  Section  5  for  a  fixed  or  random  number  of  observed  random 
variables.   It  is  shown  that,  because  of  a  certain  degeneracy 
in  the'  process,   p  can  be  estimated  exactly  in  a  long  enough  sequence 
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Other  solutions  to  equation  (1.1)  are  considered  in  Section  6. 
In  fact  random  variables  for  which  solutions  to  (1.1)  (in  trans- 
formed version)  exist  are  known  as  self -decomposable  random 
variables  or  random  variables  of  type  L;  see  Feller  (19  71,  p. 
588-90) .   Two  cases  of  interest  in  modelling  point  processes 
are  discussed.   These  require  consideration  of  Gamma-distributed 
intervals,  and  intervals  with  a  mixed  exponential  distribution; 
questions  connected  with  the  latter  remain  unresolved. 

Finally  in  Section  7  we  consider  the  question  of  obtain- 
ing autoregressive  and  Markovian  exponential  sequences  with 
negative  correlation.   It  may  be  seen  that  (1.1)  has  no  solution 
if   P  is  negative.   However,  as  soon  as  multivariate  exponential 
sequences  are  considered — in  particular,  multivariate 
sequences  which  are  antithetic  realizations  of  each  other 
— we  discover  a  method  for  obtaining  processes  with  exponential 
or  other  specified  marginals  and  that  exhibit  negative 
correlation. 


(2.1) 


2.     EXPONENTIAL  SEQUENCES 

In  what  follows   {E  }   will  always  be  a  seouence  of  i.i.d 

n 

exponentially  distributed  random  variables  with  parameter   A: 


P{E   <  x)  =  F^  (x)  =  1  -  e  Ax     (x  >  0,   A>  0) 
n  —       -hj  — 

n 

=0  (x  <  0)  . 

Also  an  exponential  (A)  random  variable  will  mean  a  random 
variable  with  distribution  (2.1). 

2.1.   The  exponential  first-order  autoregressive  sequence  (EAR1) . 

The  starting  point  of  the  work  in  this  paper  is  the  question 
as  to  whether  the  autoregressive  equation  (1.7) 


X   =  pX   _  +  e   =   J   p:e  |p|  <  1  (2.2) 

n    w   n-1    n    ,Ln    K  n-i 

3  =  0       J 

has  a  solution  for  a  given  distribution  of   X.   (note  that  the 

expansion  in  (2.2)  as  an  infinite  moving  average  is  valid  because 

Ipl  <  1).   Now   X  _ ,   is  a  function  only  of   e   , ,  e   ~,  ...   and 

is  therefore  independent  of  e  .   Therefore  the  Laplace-Stielt jes 

transform  <\>       (s)   of  the  distribution  of   X    is,  at  least  for 

n  n 

s  >  0, 

4>x  (s)  =  E[exp(-sXn)]  =  E[exp(-spXn_1  +  e  )] 
n 

=  <$>x  (ps)   (j)    (s)   . 

n-1       n 


Thus 


<J>X  (s) 
n 


(2.3) 


Assuming  that  the   X    sequence  is  marginally  stationary,  we  get 


the  basic  equation 


♦  E(s)  = 


<J>x(ps) 


Now  it  is  clear  that  if  we  require  the   X    to  be  positive  random 

variables,  then  if   p   is  negative  so  is   pX    and  we  need  the 

error  term   e  ,  which  is  independent  of   X   ..  ,  to  make   X   positive 
n  ^  n-1  n  r 

Thus  for  positive  random  variables  there  will  clearly  be  no  solution 
to  (2.3)  for   p  <  0.   However  if  we  let   0  <  p  <  1   and, side- 
stepping the  general  question  of  existence  and  uniqueness  of 
solutions  to  (2.3) ,  require  the   X  ' s  to  be  exponential  with 


*x(s>  ■  r-r-r  - 


then  equation  (2.3)  forces 

A      A  +  ps    A  +  ps 


VS)  =  A  +  s 


A 


A  +  s 


(0  <  p  <  1,  A  >  0) 


=  p  +  d-p) 


A  +  s  * 


(2.4) 


The  latter  expression  is  the  Laplace-Stielt jes  transform  of  a 

random  variable,  in  fact  a  non-negative  random  variable  which 

has  an  atom  of  mass   p   at  zero  and  which  is  exponential ( A)  if 

positive.   Thus  we  can  write  the  difference  equation  generating 

the  series   {X  }   as 
n 


X   =  pX   -,  +  £ 
n      n-1     n 


(  pxn-: 


pX_  -,  W.p.  p 

0  <_  p  <  1  (2.5) 

(  P\_±    +  En     w.p.  (1-p) 


=   PXn-l  +  Vn'  (2'6) 


where   {I  }   is  an  i.i.d.  sequence  in  which   1=1   with  prob- 
n  n 

ability   (1-p) ,1   =0   with  probability  p   and   {En}   is  an 
i.i.d.  sequence  of   (A)   exponentials.   It  is  easily  verified, 
directly  from  (2.2)  or  from  the  definition  of   e  ,  that 


E(e  )  =  (1-P)  t  '  var(e  )  =  -5-  (1-P  )  .        (2-7) 

n  A  A 


There  are  several  points  to  be  made  about  the  sequence   Xn : 

(i)   When   p  =  0,  then   {X  }   is  an  i.i.d.  sequence  of 

exponential ( A)  r.v.s. 

(ii)   The  representations  (2.5)  and  (2.6)  of  the  process  as 

an  (additive)  random  linear  combination  of   X   ..   and   E  , 

n-1        n 

which  has  a  distribution  independent  of   p,  are  sometimes 

more  convenient  than  is  (2.2). 

(iii)   Although   X    is  strictly  a  linear  process  (AR1) ,  it 

differs  from  the  normal  or  Gaussian  process  ((2.2)  with 

e   normally  distributed  of  Gaussian)  in  that  the  mean, 
n  

variance  and  higher  moments  of  the   e  ' s  are  functions 


of  the  parameter  p.  Thus  one  cannot  apply  general  theorems  about 

linear  processes  to  the  exponential  sequence  because  such 

theorems  usually  assume  that  the   e  's  have  a  distribution 
J  n 

which  is  free  of  the  parameter   p.   As  an  example,  X    is 
exponentially  distributed  even  when  p  is  close  to  one, 
despite  theorems  for  linear  processes  which  assert  that 
in  this  case  the   X    are  approximately  normally  distributed. 
For  instance,  if  the   e  's  are  independent  of   p   then 

the  skewness  of  the  stationary  distribution  of   X    can 

■*  n 

be  shown  to  approach  zero  as   p  t  1;  in  the  special  case 
of  (2.4),   this  quantity  is  independent  of   p   and  equal 
to  2 — the  value  associated  with  the  exponential  d.f. 
(iv)   There  are  many  other  sequences  of  random  variables   {X  } 

with  exponentially  distributed  marginals  and  the  first-order 

Markov  property.   In  fact,  given  any  bivariate  exponential 

distribution   F„   „  (x, ,x0) ,  assumed  for  convenience  to 

be  absolutely  continuous,  with  conditional  density 

fE  IE  (x2;x-|)  '  tnen  we  construct  a  Markovian  sequence  with 

joint  density  for,  say,  X  ,  X  _,,...,  Xfi   as 

fX  ,  .  ..  ,Xn(xn"  '•  ,X0) 
n       0 


An  interesting  and  useful  feature  of  the  EAR1  sequence 
given  by  the  solution  (2.6)  of  the  autoregressive  equation  is 
that  it  is  a  (random)  linear  combination  of  i.i.d.  exponential 
sequences,  and  is  thus  easy  to  simulate  on  a  computer,  gives 
reasonably  tractable  analytic  results  and  in  turn  suggests  the 
definition  of  processes  with  more  complicated  correlation 
structure  and  exponential  marginals  (Lawrance  and  Lewis,  19  77, 
Jacobs  and  Lewis,  1977,  Lawrance  and  Lewis,  1978).   The  correlation 
structures  are  essentially  the  same  as  those  for  linear  processes 
(Lawrance  and  Lewis,  1978). 


2.2.   Serial  correlations,  stationarity  and  mixing 

Simple  computations  show  that,  as  is  true  of  any  regular 
Markov  process,  the  serial  correlations  for  the  EAR1  process  are 

c.  =  corr(X  X   . )  =  p11  ,       0  <  p  <  1. 

j  n   n+j     K  —  M 

Note  that  the  correlations  are  always  positive.   The  spectrum  of 
the  sequence  is 


oo 

:.  (W)    =   i-   {1    +    2      I      c  . 
+  *  j=l      3 


i  i  2 

1  1    -    p 


cos(jco)}  (0<w<tt;    0<p<l) 


IT  n        ■  2  „ 

1+p       -    2p    cos   u 


For   p  =  0   we  have   f  (co)  =  1/tt   since  the  sequence   X    is  i.i.d 

+       '  ^        n 


Moreover  it  is  not  difficult  to  show  that  the  process  is 
stationary  if  one  takes 


XQ  =  EQ  (2.8) 

X   =  pX   ,  +  £   =  pX   .  +  I  E  (2.9) 

n     n-1     n    K  n-1     n  n  ' 


with   £    defined  by  (2.4).   This  is  no  surprise,  for  the 
process  is  Markovian  and  is  constructed  to  have  an  exponential 
distribution  in  the  stationary  case. 

With   E   =  X„   it  is  shown  in  Jacobs  and  Lewis  (1977)  that 
the  sequence   {X  }   is  strong  mixing  in  the  sense  of  Rosenblatt 
(1971)  . 


3.   SUMS  OF  STATIONARY  INTERVALS 

An  important  aspect  of  the  stationary  sequence   {X  }, 

especially  when  it  is  used  to  model  the  intervals  in  a  point 

process,  is  the  moments  and  distributions  of  sums 

T   =  X   +  X   ,+•••+  X     , .   The  Laplace-Stielt jes  transform 
r     n     n+1         n+r-1  r 

of   T    is  found  directly  by  noting  that,  in  terms  of   X  . 
(2.2)  may  be  modified  to  give  this  representation: 

Xn+j  =  PJxn  +  Pj"l£n+1  +  PJ"2en+2  +-"+  en+j'   (j  =  0  ,1,2, . . . )   (3.l]j 

and  so 

r-1  / ,   r  \   r-1      /  .   r-j 

'  1-P     1  _L      V      £  (   1-P      J 


r    >0   n+:     n  \l-p  /   j£1   n+j  \  1-p 
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It    follows    that 


<j>       (s)    =   E[exp(-sT   )  ]    =    4>x 
r  n 


I-P" 
1-P 


r-1 

n     <$> 


i-Pr-j 
1-p 


(3.2) 


If  the  process  starts  with  the  stationary  distribution,  i.e., 
if   X    is  exponential ( A) ,  then 


<J>   (s)  = 
T 


A 


r-1  /  A  +  sp 

n 


-nr"^ 


1-p 


1-p 


i  +  s'pf  ^V-K2 


(3.3) 


Clearly  the  distribution  of   T    can  be  found  explicitly  by 

inverting  (3.3),  which  can  be  accomplished  by  expanding  in 

partial  fractions.   So  also  can  an  expression  for  the  interval 

spectrum.   The  result  is  analytically  awkward,  and  will  not 

be  quoted.   In  order  to  obtain  (i)  the  intensity  function,  m  (t)  , 

of  a  point  process  with  EAR1  intervals,   (ii)  the  point  spectrum 

g  (go)  ,  and   (iii)  the  distribution  of  the  counting  process   N(t) 

(for  definitions  see  Cox  and  Lewis,  1966,  Sec.  4.5)  one  must  be 

able  to  sum  <p       (s)   over   r;   there  appears  to  be  no  neat 

r 
explicit  formula  for  any  of  these  functions.   The  variance  of 

T   may  be  calculated  directly  from  (3.1)  since  the  innovations 
r  B£rt  i 


{c    }      are  independent: 


n. 
I+n 

Var[T    1    =  \ 


1-P 


r\2 


2  ( r-1-) 

1-P' 


+   ,1±P       r-l-2p   iz£_ 
1-p    /  \  1-p  /  /  \    1-p 


p2/l-p2^> 
V     l-o2 


(3.4) 


■  I)    Y^ clidfidoiq 
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p 


Hence  for   p  <  1   the  index  of  dispersion  (Cox  and  Lewis  (1966) , 


p.     71)     is 

th 


yfar  r¥r3]Pers  :   and   Lewis    (1966), 


J   -     lim    J      =     lim       ^—j  =   t        *   =   1   +   -f^ —  (3.5) 

r->oo      r        r^oo      r{E[X1]}/         X         p  l    ~    p 


1    -    p    ~    X    +   1^-77  (3.1) 


As  a  byproduct  of  (3.5)  we  get,  for   p  <  1,  values  of  the  slope 
of  the  variance  time  curve,  lim     V  (t)  =  V  (°°)  ,  the  initial 


I 


t  -*■ 

fo; 


points  "of  the  spectrum  of  counts  ;  "g  (do)  <  and  the.  speetth©  Slope 

ot  trie  variance  tin 

intervals  as  (Cox"  and  Lewis  ,  '1966  ,  s¥c£t)  4t6Y'  (°°)  ,  the  initial 

trum  of 
f  (o+)  =  fl?6  6,  Sect.  4.6) 

+  77 

V'.(oo)    =   -XJ 


g+(0+)  =  aj/tt 


Limit  theorems  for  T    and  the  counting  process   N(t)   are  given 
in  Jacobs  and  Lewis  (1977) . 

are  given 

4.   JOINT  DISTRIBUTIONS  FOR  THE  STATIONARY  SEQUENCE 

Any  pair   X  ,  X     in  the  stationary  EAR1  sequence  has 

a  bivariate  exponential  distribution.   Consider   X   and   X   n  . 
r  n        n+1 

Directly  from  the  definition  (2.5)  and  (2.7)  it  can  be  shown 

that  the  conditional  random  variable   X  ,  ,  ,  given   X  =  x,  has 

n+1   3       n 

an  atom  of  mass   p   at   px;  otherwise  it  is,  with  probability  (1-p)  , 

12 


? 


an  exponential ( A)  random  variable  shifted   px   from  the  origin 

The  regression  of   X  , ,   on   X   =  x   is 
^  n+1       n 


E(Xn+1IXn  =  x)  =   x  +  (1-p)  i  ,  (4.1) 


which  is  linear.   Moreover   var(X   ,  I X  =  x)  =  (1-p  ) /A  ,    a 

constant  independent  of   x,  and   X   ,   is  never  smaller  than 

)X  .   The  Laplace-Stieltjes  transform  of  the  bivariate  exponential 

distribution  of   X  ,  ,   and   X    is 

n+1        n 


5X  ;X  x.  (srS2}  =  ^{expf-s^  -  s2Xn+1)} 
n  n+1 


=  E{exp[-s1  +  ps2)Xn]}  <j>   (s2) 

n 

A (     ,,   v    _A ) 

A  +  s1  +  ps2  |P+  u~p;   A  +  s2  j 


=  P  ,    ^    ^    \    ^      +  (1-P) 


A  +  s,  +  ps2       K   \  A  +  s,  +  ps2/\A  +  s2 


(4.2) 


T  e  bivariate  distribution  of  {X   , ,  X  }  has  a  singular  component 
a  ong  the  line   X   ,  =   X  ,  with  probability   p  (the  first  term 
d  1  (4.2),  and  a  continuous  component  in  the  space  X     >  pX 
the  second  term  in  (4.2))  which  is  the  joint  distribution  of   X 

nd   pX   +  E  .   Note  that  the  distribution  is  not  symmetric  in 

n    n  2 

■    ,,   and  X  ,  since  the  transform  <$>v  v  (s,  ,s0)   is  not 

n+l        n  xn+l'n  l      Z 

symmetric  in   s,   and   s^.   This  simply  means  that  the  process 

is  not  time-reversible,  as  is  the  normal  or  Gaussian  ARl  process. 

This  asymmetry  appears  in  the  conditional  mean  of   X  , 
given   X   ,  =  x,  which  is  obtained  from  (4.2)  ,  as 

This  asymmetry  also  becomes  evident  in  the  non-linear  form 

of  the  conditional  mean  of   X  ,  given   X  . ,  =  x,  which  is  obtained 

n  n+i 
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from  (4.2),  as 

E(Xn|Xn+l=x)  =  Takr   {1  "  exp["  (?  "  1)Axl}  *   E[Xn+l|Xn=x]  ' 
The  conditional  variance  of   X  ,  given   X   ,  =  x,  can  also  be 

obtained;  unlike   var(X  . , |x   =  x)   it  is  not  a  constant. 

n+1   n 

Hiqher  order  directional  moments  of   X  , , ,  X    are  given 

n+1   n 

in  Jacobs  and  Lewis  (19  77) ,  and  again  evidence  the  directionality 
of  the  process: 


C9  ,  (1)  =  E(X^  X   .)  -  E(X*)  E(X    )  =  %    ; 
2,1         n   n+l       n     n+1     ,3 

A 

Clf2(l)  =  E(Xn  X2n+1)    -  E(Xn)  E(X2+1)  =  2p(^p) 

A 


Since  the  process  is,  from  its  definition,  Markovian,  the 

conditional  correlation  of   X   .   and  X  _, ,  given   X   =  x,  is 

zero.   We  do  not  discuss  distributions  of  triples  any  further. 

In  principle  it  is  simple  to  write  down  transforms  of  the  joint 

distribution  of  any  set  of  k   X  ' s ;  one  obvious  use  of  this 

n 

result  would  be  in  obtaining  the  distribution  of  the  sum  of 

X  ,  ...  ,  X  ,  _, .   However,  this  was  obtained  by  a  direct  argument 

in  the  previous  section. 


5.   ESTIMATION  AND  DEGENERACY 

Although  it  is  not  possible  to  write  down  the  likelihood 
equation  for  the  EAR1  process  in  a  tractable  form  where,  say,  we 
observe   X,,  ...  ,  X-j,  it  seems  at  first  sight  that  good  estimates 
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of   1/A   and   p   can  be  obtained  from  the  sample  mean,  X  =  T  /n , 
whose  variance  is  given  at  (3.4),  and  a  first-order  serial 
correlation  coefficient  estimate   p,   with  modifications  to  take 
account  of  the  fact  that  the  variance  equals  the  mean  squared, 
i.e.   Var(X)  =  1/A2. 

However,  there  is  a  degeneracy  in  the  process  which  makes 
it  possible  to  find   p   exactly  in  a  long  enough  run  of   X  's, 
and  to  estimate   A   more  precisely  than  is  possible  using 
straightforward  moment  methods. 

To  identify  the  degeneracy,  which  we  call  the  zero-defect, 
we  note  that  in  the  process  there  are  runs  of   X  's  which  are 
equal  to  the  previous  value,  X  _, ,  times   p.   Moreover,  given 
that  there  is  a  run  of  length   R  of  this  type,  R   has  a  "geometric 
plus  one"  distribution  with  parameter   p; 


P{R  =  i>  =  (1  -  pjp1  X  ,    i  =  1,2,3,  ...  (5.1) 


E(R)  =  T— ^ ;    var(R)  = 


1  "  p (i  -  pi2    " 

This  behavior,  or  a  tendency  towards  it  in  data  would  be  very 

evident  in  plots  of  a  sample  sequence  of   X   values. 

Thus  to  estimate   p   in  an  observed  series,  let   Z   =  X  -,/X    , 

n     n  +  1   n 

for   n  =  2 , 3,  ...  .   Then 


zn  =  2±i  =  p  if  xn+1  =  Pxn   w.p.   p 

n 

E   , 

=  P  +  -j£-±     if   Xn+1  +      pXn   w.p.  (1-p) 


n 
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The  probability  that   Z   =  p,  or   Z   ^  p,  is  independent  of  previous 

Z  's  and   P{Z   =  p}  =  p.   Hence  if  we  observe  the  minimum  value  of 
n  n 

Z  ,  n  =  2,3,...   until  we  get  a  tie,  that  tie  value  is   p.   The 
time  to  obtain  this  tie  is  the  sum  of  two  "geometric  plus  one" 

random  variables,  with  total  mean   2/p   and  total  variance 

2 

2(l-p)/p  .   The  implication  is  that  we  can  find   p   exactly  after 

a  random  number  of  observations  for  the  present  model.   Moreover 
since  one  then  knows  those   X  's  which  are  made  up  of   pX   , 
plus  an   E  ,  it  is  possible  to  compute  the   E  's  exactly  (except 
for  two)  and  estimate   E(X)  =  1/A   directly  from  the  observed  E  's. 

For  a  fixed  length  observation  of  the  process,  say 
X, ,  ...  ,  X^,  the  probability  that   p   can  be  obtained  exactly 
is  the  probability  that  the  sum  of  two  "geometric  plus  one" 
random  variables  is  less  than  or  equal  to   N.   If  two  minimum 
values  are  not  observed  by  time   N,  a  good  estimate  of   p   will 
be  the  minimum  of  the   Z.'s.   This  will  either  be   p ,  or   p   plus 
a  small  bias  term.   The  mean  can  be  estimated  as  the  sample 
average  in  the  usual  way. 

This  degeneracy  in  sequences  of  positive  random  variables 
generated  from  the  stochastic  difference  equation  seems  to  be 
inherent  in  the  present  procedure.   Slightly  more  realistic  but 
complicated  first-order  autoregressive  exponential  processes 
without  this  zero  defect  will  be  discussed  elsewhere. 
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6.   OTHER  SOLUTIONS  TO  THE  AUTOREGRESSIVE  EQUATION 

Random  variables   X   for  which  the  transformed  first-order 
autoregressive  equation 


<J>x(s) 


=  <j>  (s)  (6.1) 


4>x(ps)    ^e 


has  a  solution  <b    (s)   for  each   0  <  p  <  1   which  is  the  transform 

£ 

of  a  distribution  function  are  called  random  variables  of  class   L 
(Feller,  1971,  p.  588) ,  or  self-decomposable  random  variables. 
Although  there  has  been  much  recent  interest  in  finding  class   L 
random  variables,  the  connection  with  the  first-order  autoregressive 
process  does  not  seem  to  have  been  made.   On  the  other  hand  there 
have  been  some  attempts  to  find  non-normal  solutions  to  the  auto- 
regressive equation  without  connecting  them  to  class  L  theory 
(see,  e.g.  Bernier,  1970). 

The  limitation  of  the  theory  of  class  L  random  variables  as  it 
relates  to  the  present  work  is  that  it  requires  a  solution  for  each 
0  <  p  <  1,  which  is  the  case  for  exponential (A)  random  variables   X. 
This  full  range  of   p   is  desirable,  but  may  not  occur. 
We  do  not  explore  the  full  connection  here,  but  consider  only 
two  types  of  random  variables,  Gamma  distributed  random 
variables  and  mixed  exponential  random  variables.   These  are 
frequently  used  as  alternatives  to  the  exponential  in  modelling 
stochastic  phenomena  such  as  response  times  in  queues. 
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6.1.   The  Gamma  Autoregressiye  Process  GARl . 

A  necessary  condition  for   X   to  be  in  the  class   L   is 
that  it  be  infinitely  divisible,  and  therefore  we  consider 
Gamma  (A,k)   variables  with  density 

k  k-1  - 
fv(x)  =  --^ (A>0;k>0;x>0)  (6.2) 

x      kkr(k) 

and  Laplace-Stielt jes  transform  (of  the  stationary  distribution) 

Vs)  ■  (rre)'   •  (6-3> 

Then  the  solution  to  (6.1)  is 

V  =  (H-S1)*-   (p+  (1"p)  rrrf  <6-4> 

It  follows  from  a  result  reported  by  Feller  ((1971),  p.  452)  or 
directly  from  Theorem  1,  p.  450)  that   [(A  +  ps)/(A  +  s) ]   is 
the  transform  of  an  infinitely  divisible  distribution,  and  hence 
(6.3)  is  actually  the  transform  of  (an  infinitely  divisible)  distri- 
bution function  for  every  real   k  >  0.   Thus  we  can  in  principle 
generate  an  autoregressive  process  with  gamma  marginals  (6.2) 
by  utilizing  the   {e  }   process  characterized  by  (6.3).   Here 
are  a  few  simple  special  cases. 

k  =  2:   d>e(s)  =  P2  +  2p(l-p)  £-£-y  +  (1-p)2  f  j4t)  ;         (6*5) 
k  =  3:   tj>  (s)  =  p3  +  2p2(l-p)  a    *  .     +  2p(l-p)2 


+  (1"p)3(i4r)3  •  (6-6) 
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Thus  for  k  =  2   the  random  variable  e  may  be  thought  of  and 
realized  as  a  convex  mixture  of  a  degenerate  random  variable 
with  mass  at  zero  and  an  exponential ( A)  and  a  Gamma  (A, 2) 
distribution.   Alternatively  if   E.   and   E_   are  independent 

exponential ( A)  random  variables,  the  sum  is  chosen  with  probability 

2  2 

(1-p)  ,  E,   is  chosen  with  probability   2p(l-p)  ,  and  neither  of 

2 
E,   or   E„,  but  rather  zero,  is  chosen  with  probability   p  . 

Unfortunately,  no  really  simple  way  of  generating  random  vari 

ables  with  Laplace-Stielt jes  transform  (6.4)  for  noninteger   k   is 

known,  though  representation  as  an  infinite  sum  of  weighted 
Gamma  variables  is  possible  (Bernier  (1970)).   Moreover  for  a 
case  of  much  interest   k  <  1,  so  that   X   is  overdispersed 
relative  to  an  exponential  random  variable ,    and  the  degeneracy 
or  zero-defect  of  weight   p    becomes  very  prominent. 

As  was  true  for  the  exponential  case,  we  have  for  the 
correlations,  which  are  all  positive, 

c.  =   p3  0  <  p  <  1; 

:  - 

also 

E(e)  =  A(l-p) 

Var(e)  =  A2(l-p2)/k  . 

As   k  ■*  °°,  the  degeneracy  disappears  and   X  tends  to  become 
normally  distributed. 
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6.2.   The  Mixed  Exponential  Process  MEAR1 

Consider  now   X   to  be  a  mixture  of  two  exponential  densities 

-A,x         ~A?X 
fx(x)  =  ^A^      +  iT2A2e      ' 

x  >  °'   ^i  =  1  "  v2   -   °'   Al  <  A2  *  (6.7) 

For   7^  =  0,   TTj^  =  1 ,  and/or  X±   =  A2   this  of  course  reduces 

to  an  exponential  density.   There  are  two  main  cases  to  consider, 

besides  these  special  cases. 

If   0  <_  7T,  <_  If  then  we  always  have  a  convex  mixture  of 
exponential  densities:   fv(x)   is  a  proper  density  function  and 
the  coefficient  of  variation  of   X   lies  between  1  and  infinity 
(Cox,  1964) ,  so  that   X   is  overdispersed  relative  to  an 
exponential  random  variable;  X  may  be  generated  as  a  mixture 
of  two  exponential  random  variables  with  parameters   A,   and 
A2,  and   X   is  infinitely  divisible  (Feller  (1971),  p.  452). 

If   7T,   is  greater  than  1,  so  that   tt„   is  negative,  a 
necessary  and  sufficient  condition  for   fv(x)   to  be  a  p.d.f. 
is  that   ^-i^-i  +  7T2^2  —  ^   or  ec3uivalently  that  tt,  £  [1  -  (A,/A~)] 

Here  is  alternative  way  of  writing  the  transform  of   X. 
First,  and  directly, 

Al  A9 

X        1  A1  +  s     2A2+s  (6.8) 

Now  since   A2  >  X±      factor  to  obtain 
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Vs>  -  xr 


2     (  Ai  {  h**\\ 

^(*2  +  *iaj  \rprs)\ 


^|'l-'i,(l-   y^^-^xi^J- 


x2 


Now  even   if      tt      >    1    (ti      <    0)    but 
1  2  / 


(6.9) 


Y    =    -1    I1    "    Xj)   <    X    '  (6'10) 


or   equivalently 


7T2X2    +   7T1A1    >    °  (6.11) 


then   <{>x(s)   is  the  transform  of  the  sum  of  an  exponential (A  ) 
random  variable  and  a  random  variable  having  the  distribution 
of  the  e-innovation  of  an  EAR1  process,  see  (2.4).   It  follows 
that   X   is  infinitely  divisible. 
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It  is  thus  worth  investigating  whether  one  can  generate 
an  autoregressive  process   (X  }   with  mixed  exponential 
marginals  for  some  or  all   p,  the  latter  being  the  question 
as  to  whether  the  mixed  exponential  random  variable  is  in  the 
class   L.   We  have,  directly 


Vl  7r2A2 


$       (S)      =     -7 7 r 

Ye  <J>V  (ps) 


<J>  „(s)  s    +X.         s   +A 


X  ttA,  7r2^2 


ps    +    A-i  ps    +    A 


(6.12) 


(A,  +  ps)  (A   +  ps)  (A,  A2  +  7T  A  s  +  7T  A2s) 
(A,  +  s)  (X      +   s)  (A  A2  +  7r1A1ps  +  7r2A2ps) 


After  considerable  simplification  this  Laplace-Stielt jes 
transform  can  be  written  as  a  mixture  of  a  degenerate  random 
variable  with  mass   p  at  zero,  and  with  probability  (1-p) 
(possibly)  a  random  variable   Y  which  has  (possibly)  density 
function 


fy(x)  =  YxA1e   X   +  y2A2e   2   +  y3  g-  e"xc/p  (6.13) 


where    y„  +  y   +  y      =  1,  A   <  A  ,  tt ,>  0,  tt  ^  1   and 


c  =     Xl"2 


7T1A1  +   (1-7T1)  A2 
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(A2  -  pA1) (C  -  Xx) 


v   =  -± i i-  =  p  ;                                     (6.14) 

1  (A2  -  X1) (C-  pA1) 

(A_  -  C)     (pA„  -  A, )  (A  -p\~) 

Y   =  £ x  f ± =  p   _ ± = ;   (C  ji    pA  )  (6.15) 

2  (A2  -  A1)   (pA2  -   C)  (G  -  pA2) 

(C  -  A  )  (A   -  C)  , 

Y3 rcT^l X  (C  -  PA2)  =  P3  X  (C  -  pA2)  '•            (6'16) 


Now  if   7T    is  restricted  to  be   0  <  tt   <  1 ,  then    A,  <  C  <  X 
and   p,  >  0,  p   >  0,  p_  >  0.   Then  it  can  be  shown  that  the 


s 


igns  of   Y-.  /  Yp>   Y-^   depend  on  whether   0  <  p  <_  A,/A~, 


A,/A2   <  p  <  C/A2   or   C/A2  <  p  <  1: 


(i)  Y-i  is  always  positive; 

(ii)  y2  an<^  y r>      are  positive  if   0  <  p  <_  A./A-; 

(iii)  y2  ^s  negative,  y -      is  positive  if  \^/X2    <  p  <  C/X    ; 

(iv)  y2  ^s  positive,  Yo   1S  negative  if   C/A?  <  p  <  1. 


While  this  result  is  relatively  simple,  the  only  case  in  which  it 
has  been  possible  to  establish  that   fy(x)   is  a  p.d.f.  is  that 
in  (ii)  ,  i.e.   0  <  p  <_  A,/A~.   The  ratio    A,/A?   is  roughly 
related  to  the  overdispersion  of   X  relative  to  the  exponential 
distribution;  the  smaller  the  value   A./A„,  the  greater  the 
dispersion  and  the  smaller  the  admissible  range  of   p.   It  seems 
probable  that   fy(x)   is  not  a  density  function  for  all    P;  at 
all  events  efforts  to  prove  that  it  is  in  the  class   L   using 
(6.10)  and  characterization  theorems  (Feller,  1971)  have  failed 
at  the  time  of  this  writing.   A  complete  understanding  is  not  now 
available.  23 


6.3.   Discussion 

Note  the  differences  between  the  mixed  exponential  auto- 
regressive  sequence  and  the  Gamma  (k  <  1)  autoregressive  sequence; 
the  former  appears  to  require  a  restricted  range  for   p,  e   can  be 
simulated,  and  the  zero  defect  has  probability   P;  the  latter 
sequence  is  valid  for   0  <_  p  <  1 ,   £  cannot  at  present  be 
simulated,  and  the  zero-defect  has  probability  p   >  p  • 
The  magnitude  of  this  zero-defect  alone  probably  makes  the 
Gamma  process  less  useful  than  the  mixed  exponential  process. 
The  tail  behavior  of  the  marginal  distributions  is  also 
different. 

Laplace-Stielt jes  transforms  of  distribution  of  sums, 

T    of   X  's  are  obtained  from  (3.2)  for  both  the  Gamma  and 

rl       n 
mixed  exponential  autoregressive  sequences.   These  cases  are 

still  under  study. 
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7.  MULTIVARIATE  SEQUENCES  AND  NEGATIVE  CORRELATION 

Even  though  the  autoregressive  equation  (1.1)  has  no  solution 

representing  a  positive  random  sequence  when    p  <  0   it  is  possible 

to  modify  it  so  as  to  produce  negatively  correlated  exponential 

sequences.   These  will  not  be  individually  Markovian,  but  will  have 

Markovian  properties  in  a  bivariate  sense.   A  suggestion  as  to  how 

to  proceed  comes  from  writing   X    in  terms  of  the  innovations 
^  ^    n 

{e    }       as    in    (2 . 2)  : 
n 

n  2 

X      =      Y      p^e  =e      +pe       n+pe      t    +  •••  +    p    e       .  (7.1) 

n         .L      M      n-n  n        K    n-1         K      n-2  Kn   0 

j=0  J 

Now  consider  generating  two  (or  more)  series  in  parallel,  using 

independent  pairs  of  innovations,  (e  ,  £*) ,  for  which  if  desired 

Cor(e  ,  e  '  )  <  0;  e.g.  for   0  <  p  <  1   and  starting  with  the  first 
%  n   n     '  — 

innovation. 


2         3 
Model  1:   Xn  =  pX^  +  en  =  en  +  pe^   +  p  en_2  +  p  e^  +•••  ^  ^ 

XA  "  pXn-l  +  EA  "  £A+p£n-l  +  p2eA-2  +  <=\-3    +"" 


2         3 

Model  2:   X  =pX   .  +  e   =  e  +  p£  ,    +  P  £„  0  +    P  £     +  • • • 
n     n-1     n    n    n-1      n-2      n-3 

2  ,       3  .         (7.30 

X'  =pX'    +£'=£'+  p£'  t  +  P  e '  0  +  p  £ '  .,  +•  •  • 

n     n-1     n    n    n-1      n-2      n-3 
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In  order  for   {X  }   and   {X1}   to  exhibit  asymptotically 

n  n 

exponential  marginal  distribution,  it  is  clear  that  the  marginal 

innovation  sequences  {e    }   and   {e'}   must  themselves  be  iid 
^  n  n 

with  an  atom   p   at  zero,  and  otherwise  an  exponential  distribution. 
This  can  be  accomplished  in  several  ways;  here  are  two: 

(i)   Let 

e   =  £  *  =  0  with  probability   p,  0  <_  p  <1 


e   =  Em  =  -  y  £n(U  ) 
n    n      An 

£'  =  E'  =  -  ^  £n(l-U  ) 
n    n      An 


(7.4) 
with  probability  1-p 


where   {U  }   is  a  sequence  of  uniform  (0,1)  "random  numbers"; 


n 


{E  }   and   {E1}   are  then  called  antithetic ,  see  Hammersley 

and  Handscomb  (1964),  and  are  maximally  negatively  correlated, 

2 

having  correlation   1  -  (tt  /6)  =  -  0.6449;  see  Moran  (1967). 

Each  series  in  Models  1,  2  receive  independent  innovations  of 
exactly  the  type  described  in  Section  2  and  that  were  there 
shown  to  lead  to  exponential  marginals  as   n  ■*■   °°.   In  this 
case 

E[£,£'J  =  (1-p)  4r  /   £n  u  £n(l-u)du  =  (1-p)  \  \2    -   ~-l   (7.5) 

A2  0  A2  L     6  J 

and 

Cov(£,£')  =  (1-p)  -T  [2  "  T-]  "  U"2)  (7*6) 

Also      Cov(£    ,£')     >    0      if       d    >     (tt2/6    -    1). 
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(ii)   If  the  distribution  function  of   c    is   F   ,  then  in  true 

n        en 
antithetic  fashion  determine 


£   =  F  1(U  ) ;  £'  =  F  1(1-U  )  (7.7) 

n    e   n  n    e     n 


In  particular,  if 


if   x  <  0 


F£  (x)  =  <  p  if   x  :  0  (7.8) 

p  +  (1-p) (l-e"Ax)    if   x  >  0 


Then  innovation  leading  to  marginal  exponentials (A)  are 
obtained;  see  Section  2.   In  this  case 


£=0  if   U   <  p;   e'  =  0  if   U   >  1-p; 

n  n  —      n  n  — 

A    \  1  -  p  /        n    K  A     \  1  -  p  ' 

(7.9) 


Consequently 


l*.W    =\     'An  (^)  .n(i^)du  , 


-±2         J  .  \ 

=   0  ,  \   <  p  £  1 


(7.10) 


and  hence 
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1-p  2 

Cov(en,e^)  =  i   /  An  ^  An  ^  du  -  ^   f      0  <_   p  £  | 

A     p 

l-   2  l 

=  -  .^f-        ,  j   <  P  £  1  .  (7.11) 


Scheme  (ii)  is  capable  of  generating  negative  correlations 

of  greater  magnitude  than  is  Scheme  (i),  as  is  clear  from 

examining  the  situation   1/2  <  p . 

The  expansions  (7.2)  and  (7.3)  allow  the  computation  of 

the  lagged  cross  covariances  between   X   and   X' .   For 
^  n        n 

n  -*-  °°  we  find 


Model  1:   Cov(X  ^.,X')  ~  -£-*■  Cov(e,e')  ,   j  =  0,1,2,...        (7.12) 

n+i   n     ,2  J 

1-p 

j     |Cov(e,e')   for   j  =  0,2,4,... 
Model  2:   Cov(X   .,X')  ~  -&-rj   x  (7.13) 

n    :      n  1-p  (Var(e)  for      j    =    1,3,5,... 


Of   interest    is    the    fact   that    the    series    of  Model   2    may   in 
fact   exhibit  negative    correlations;    for,    directly    from    (7.2), 

and   as      n   ■>   °°, 

Cov(Xn+l'V    =  Cov(XA+l'XA)    ^Cov(£'£,) 

1-p 

2 

Cov(X         ,X    )    =   Cov(X'        ,X')     --£---  Var(e) 
n+z      n  n+z      n  ,       /. 

1-p 


j  I   Cov(e,e' ) ,  j    =    1,3,5, . . . 

Cov(X   ^.,X    )    =    Cov(X'.,X')     r 1>  x     < 

n+]      n                      n+n  '    n           ,2  I  „      #    »                          o    >■    *• 

J                                   J                  1-p  (Var(e),  3    =    2,4,6,... 
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Consequently  if  scheme  (i)  or  (ii)  above,  or  a  multitude  of  other 
possibilities,  generate  the  innovation  pairs,  then  the  odd-numbered 
covariances  will  in  fact  be  negative. 

Clearly  the  above  generation  scheme  can  be  generalized, 
e.g.  by  including  more  equations,  {X  },  {X1 } ,  {X'},  etc.,  and  by 
allowing  the  innovations  to  have  different  marginal  properties. 
Further  investigations  are  planned. 
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8.   SUMMARY  AND  CONCLUSIONS 

We  have  presented  a  simple,  autoregressive  Markovian  sequence 

{X  }   of  exponential  variates  EARl  which  is  an  additive  random 
n 

linear  combination  of  the  previous  value,  X   .,  and  an  independent 
exponential  random  variable.   The  simplicity  of  this  structure 
allows  one  to  model  in  an  intuitive  way  dependencies  in  stochastic 
systems.   Jacobs  (19  78)  has  considered  cyclic  queues  with  EARl 
service  times  and  found  that  the  correlation  may  produce  a 
significant  effect;  more  general  queueing  schemes  which  generate 
multivariate  exponential  sequences  are  given  by  Lewis  and  Shedler 
(1978) . 

Maxima  of  the   X    in  the  EARl  process  have  been  studied 
by  Chernick  (1977) . 

Two  other  marginal  distributions  for  the   X  's  have  been 
studied,  the  Gamma  distribution  and  the  mixed  exponential.   The 
former  is  known  to  be  a  type   L,  i.e.  satisfies  (  6.1)  for  all 
p,  while  the  mixed  exponential  appears  to  satisfy  (6.1)  only  for 
a  limited  range  of   p.   Other  type  L   distributions  will  be 
investigated  in  the  context  of  the  modelling  of  independent 
stochastic  sequences  elsewhere. 

Extensions  of  the  first-order  autoregressive  structure 
for  exponential  marginals  to  higher-order  autoregressions , 
moving  averages  and  mixed  autoregressive-moving  average  structures 
has  been  given  by  Lawrance  and  Lewis  (1977),  Jacobs  and  Lewis 
(1977)  and  Lawrance  and  Lewis  (1978) .   The  possibility  of  extending 
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the  EARl  structure  depends  on  the  fact  that  the   { e  }   error 

n 

sequence  is  a  mixture  of  a  random  variable  with  mass  at  zero 
and  an  exponential  distribution. 

Three  other  possibilities  will  be  detailed  elsewhere-- 
a   slight  complication  which  brings  in  another  parameter  but  gets 
rid  of  the  "zero-deficiency"  in  the  EARl  process;  extension  of 
mixed  correlation  structures  to  give  negative  correlations, 
and  further  multivariate  extensions. 

Finally  we  note  that  it  is  possible  to  introduce  non- 
stationarity  and  dependence  on  concomitant  variables  into  the 
sequence  by  multiplying   X    by,  say, 

r 
A (n)  =  exp{  I    a . z  .  (n) } , 
j  =  0  3  ] 

as  was  done  in  Cox  and  Lewis  (1966,  Ch.  3,  ii).   It  would 

be  of  interest  to  see  how  the  methods  given  in  Cox  and  Lewis 

extend  to  the  case  where  there  is  correlation  present  between 

the   X  ' s . 
n 
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